 system time


Balancing Information Accuracy and Response Timeliness in Networked LLMs

Turkmen, Yigit, Buyukates, Baturalp, Bastopcu, Melih

arXiv.org Artificial Intelligence

Recent advancements in Large Language Models (LLMs) have transformed many fields including scientific discovery, content generation, biomedical text mining, and educational technology. However, the substantial requirements for training data, computational resources, and energy consumption pose significant challenges for their practical deployment. A promising alternative is to leverage smaller, specialized language models and aggregate their outputs to improve overall response quality. In this work, we investigate a networked LLM system composed of multiple users, a central task processor, and clusters of topic-specialized LLMs. Each user submits categorical binary (true/false) queries, which are routed by the task processor to a selected cluster of $m$ LLMs. After gathering individual responses, the processor returns a final aggregated answer to the user. We characterize both the information accuracy and response timeliness in this setting, and formulate a joint optimization problem to balance these two competing objectives. Our extensive simulations demonstrate that the aggregated responses consistently achieve higher accuracy than those of individual LLMs. Notably, this improvement is more significant when the participating LLMs exhibit similar standalone performance.
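The aggregation step can be illustrated with a small sketch. The abstract does not state which aggregation rule the task processor uses, so the code below assumes a simple majority vote over an odd number $m$ of LLMs that answer independently, each with a known standalone accuracy. The function name `majority_vote_accuracy` and the example accuracies are hypothetical and chosen only for illustration.

```python
def majority_vote_accuracy(accuracies):
    """Probability that a majority of independent binary responders is correct.

    Assumption (not from the source): responders answer independently and the
    final answer is decided by simple majority vote.
    """
    m = len(accuracies)
    if m % 2 == 0:
        raise ValueError("use an odd number of responders to avoid ties")

    # dist[k] holds the probability that exactly k responders are correct.
    dist = [1.0]
    for p in accuracies:
        updated = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            updated[k] += prob * (1.0 - p)   # this responder is wrong
            updated[k + 1] += prob * p       # this responder is correct
        dist = updated

    # A majority means strictly more than half of the m responders are correct.
    return sum(dist[m // 2 + 1:])


# Three responders with similar standalone accuracy.
similar = majority_vote_accuracy([0.7, 0.7, 0.7])

# One strong responder paired with two weaker ones.
dissimilar = majority_vote_accuracy([0.9, 0.6, 0.6])
```

Under these assumptions, three responders at 0.7 reach an aggregated accuracy of 0.784, above any single one of them, while the mixed group reaches 0.792, below its best member at 0.9. This toy case is in line with the abstract's observation that aggregation helps most when the participating LLMs perform similarly on their own.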


Segmentation analysis and the recovery of queuing parameters via the Wasserstein distance: a study of administrative data for patients with chronic obstructive pulmonary disease

Wilde, Henry, Knight, Vincent, Gillard, Jonathan, Smith, Kendal

arXiv.org Machine Learning

However, many such methods rely heavily on detailed data about both the healthcare system and its population, which may limit research where sophisticated data pipelines are not yet in place. This work demonstrates a way to overcome this limitation, using routinely gathered administrative hospital data to build a clustering that feeds into a multi-class queuing model, allowing for a better understanding of the healthcare population and the system with which it interacts. Specifically, this work examines records of patient spells from the National Health Service (NHS) Wales Cwm Taf Morgannwg University Health Board (UHB) presenting chronic obstructive pulmonary disease (COPD). COPD is a condition of particular interest to population health research, and to Cwm Taf Morgannwg UHB, as it is known to often present as a comorbidity in patients [15], increasing the complexity of treatments among those with the condition. Moreover, an internal report by NHS Wales found that Cwm Taf Morgannwg UHB had the highest prevalence of the condition across all the Welsh health boards. This work draws upon several overlapping sources within mathematical research and contributes to the literature in three ways: to theoretical queuing research, by estimating missing queuing parameters with the Wasserstein distance; to operational healthcare research, by combining the methods used here despite data constraints; and to public health research, by adding to the growing body of mathematical and operational work around a condition that is vital to understand operationally, socially and medically.
The remainder of the paper is structured as follows: Section 1 provides a literature review, and an overview of the dataset and its clustering; Section 2 describes the queuing model used and the estimation of its parameters; Section 3 presents several what-if scenarios with insight provided by the model parameterisation and the clustering; Section 4 concludes the paper. Although the data is confidential and may not be published, a synthetic analogue has been archived [43] along with all the source code used in this paper [40].
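The idea of recovering a missing queuing parameter with the Wasserstein distance can be sketched in miniature. The example below is a toy under stated assumptions, not the paper's procedure: it replaces the multi-class queuing model with a single exponential distribution, uses deterministic quantiles in place of simulation output, and searches a small hand-picked grid of candidate rates. All function names and values are hypothetical.

```python
import math


def wasserstein_1d(xs, ys):
    """First Wasserstein distance between two equal-size empirical samples.

    For one-dimensional samples of the same size, this is the mean absolute
    difference between the sorted values.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def exponential_quantiles(rate, n):
    """n evenly spaced quantiles of an exponential distribution (toy stand-in
    for the output of a queuing simulation run with the given rate)."""
    return [-math.log(1.0 - (i + 0.5) / n) / rate for i in range(n)]


def recover_rate(observed, candidate_rates):
    """Pick the candidate rate whose model output is closest to the data."""
    n = len(observed)
    return min(
        candidate_rates,
        key=lambda rate: wasserstein_1d(observed, exponential_quantiles(rate, n)),
    )


# Hypothetical "observed" times generated with a rate of 2.0.
observed = exponential_quantiles(2.0, 200)
best = recover_rate(observed, [0.5, 1.0, 2.0, 4.0])
```

Here `best` is the candidate that minimises the distance between the model's output distribution and the observed one, which is the general shape of the estimation described in Section 2 of the paper. A real application would compare simulated and observed lengths of stay rather than closed-form quantiles.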


Towards Unifying Hamiltonian Monte Carlo and Slice Sampling

Zhang, Yizhe, Wang, Xiangyu, Chen, Changyou, Henao, Ricardo, Fan, Kai, Carin, Lawrence

Neural Information Processing Systems

We unify slice sampling and Hamiltonian Monte Carlo (HMC) sampling, demonstrating their connection via the Hamilton-Jacobi equation from Hamiltonian mechanics. This insight enables extension of HMC and slice sampling to a broader family of samplers, called Monomial Gamma Samplers (MGS). We provide a theoretical analysis of the mixing performance of such samplers, proving that in the limit of a single parameter, the MGS draws decorrelated samples from the desired target distribution. We further show that as this parameter tends toward this limit, performance gains are achieved at a cost of increasing numerical difficulty and some practical convergence issues. Our theoretical results are validated with synthetic data and real-world applications.